Isolated Word Speech Recognition System Using Deep Neural Networks
Abstract
Speech recognition is the process of converting speech signals into words. For many years, acoustic modeling has relied on the HMM-GMM approach. A GMM, however, requires assumptions about the data distribution in order to calculate probabilities. To remove this limitation, the GMM is replaced by a DNN in the acoustic model. Deep neural networks are feed-forward neural networks with more than one layer of hidden units. In this work, we present an isolated word speech recognition system that uses an HMM-DNN acoustic model. We use the Deep Belief Network (DBN) pre-training algorithm to initialize the deep neural network. A DBN is a multilayer generative probabilistic model with a large number of stochastic binary units. The features used are mel-frequency cepstral coefficients (MFCC). Experimental results are reported on the TI digits database, on which the proposed system achieves 86.06% accuracy. System accuracy can be increased further by increasing the number of hidden units.
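As a rough illustration of the DBN pre-training idea mentioned in the abstract, the following minimal sketch stacks restricted Boltzmann machines (RBMs) and trains each one greedily with a single step of contrastive divergence (CD-1). It is not the authors' implementation: the function names (`cd1_step`, `pretrain_rbm`, `pretrain_dbn`), the binary visible units, the full-batch updates, and all hyperparameters are assumptions chosen for brevity. Real MFCC inputs are real-valued and would normally call for a Gaussian-Bernoulli first layer.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b_vis, b_hid, rng, lr=0.1):
    """One CD-1 update for a Bernoulli RBM on a batch v0 of shape (n, n_vis)."""
    # Positive phase: hidden probabilities and a binary sample given the data.
    p_h0 = sigmoid(v0 @ W + b_hid)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(v0.dtype)
    # Negative phase: one Gibbs step back to the visible layer and up again.
    p_v1 = sigmoid(h0 @ W.T + b_vis)
    p_h1 = sigmoid(p_v1 @ W + b_hid)
    n = v0.shape[0]
    # Approximate gradient: data statistics minus reconstruction statistics.
    W_new = W + lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / n
    b_vis_new = b_vis + lr * (v0 - p_v1).mean(axis=0)
    b_hid_new = b_hid + lr * (p_h0 - p_h1).mean(axis=0)
    return W_new, b_vis_new, b_hid_new

def pretrain_rbm(data, n_hid, epochs=5, lr=0.1, seed=0):
    """Train a single RBM with full-batch CD-1 (hypothetical settings)."""
    rng = np.random.default_rng(seed)
    n_vis = data.shape[1]
    W = 0.01 * rng.standard_normal((n_vis, n_hid))
    b_vis = np.zeros(n_vis)
    b_hid = np.zeros(n_hid)
    for _ in range(epochs):
        W, b_vis, b_hid = cd1_step(data, W, b_vis, b_hid, rng, lr)
    return W, b_vis, b_hid

def pretrain_dbn(data, layer_sizes, **kwargs):
    """Greedy layer-wise pre-training: each RBM is trained on the
    hidden activations of the one below it."""
    layers = []
    x = data
    for n_hid in layer_sizes:
        W, _, b_hid = pretrain_rbm(x, n_hid, **kwargs)
        layers.append((W, b_hid))
        x = sigmoid(x @ W + b_hid)  # activations feed the next RBM
    return layers
```

The returned `(W, b_hid)` pairs could then initialize the hidden layers of a feed-forward network that is fine-tuned with back-propagation, which is the usual role of DBN pre-training in an HMM-DNN system.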
Related resources
Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods
Speech recognition is a subfield of artificial intelligence that develops technologies to convert spoken utterances into transcriptions. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...
Speech Emotion Recognition Using Scalogram Based Deep Structure
Speech Emotion Recognition (SER) is an important part of speech-based Human-Computer Interface (HCI) applications. Previous SER methods rely on the extraction of features and training an appropriate classifier. However, most of those features can be affected by emotionally irrelevant factors such as gender, speaking styles and environment. Here, an SER method has been proposed based on a concat...
Convolutional Neural Network with Adaptive Windows for Speech Recognition
Although speech recognition systems are widely used and their accuracy is continuously improving, there is a considerable gap between their accuracy and human recognition ability. This is partially due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...
Estimation of Hand Skeletal Postures by Using Deep Convolutional Neural Networks
Hand posture estimation attracts researchers because of its many applications. Hand posture recognition systems simulate hand postures by using mathematical algorithms. Convolutional neural networks have provided the best results in hand posture recognition so far. In this paper, we propose a new method to estimate the hand skeletal posture by using deep convolutional neural networks. T...
An Experimental Speaker-independent System for Isolated Word Recognition Implemented for Romanian Language
The research presented in this paper is based on the artificial neural network recognition paradigm applied to Romanian isolated word recognition. The network, which is composed of three layers (a Multilayer Perceptron), is trained with the conventional back-propagation algorithm. The ANN speech recognition system based on Mel Frequency Cepstral Coefficients was developed using the Matlab toolkit. The sy...